{smcl}
{* 02Nov2002}{...}
{hline}
{hi:help overid}{right:(SJ7-4: st0030_3; SJ5-4: st0030_2;}
{right:SJ4-2: st0030_1; SJ3-1: st0030)}
{hline}

{title:Title}

{p2colset 5 15 17 2}{...}
{p2col:{hi:overid} {hline 2}}Calculate tests of overidentifying restrictions after ivregress, ivreg2, ivprobit, ivtobit, and reg3{p_end}
{p2colreset}{...}


{title:Syntax}

{p 8 14 2}{cmd:overid}{bind: [{cmd:,} {cmd:chi2}} {cmd:dfr} {cmd:f} {cmd:all} {cmd:depvar(}{it:varname}{cmd:)}]

{p 8 8 2}{cmd:overid} may be used after IV estimation with
{cmd:aweight}s, {cmd:fweight}s, and {cmd:iweight}s;
see {help weight}.


{title:Description}

    {title:Instrumental variables regression}

{p 4 4 2}{cmd:overid} computes versions of Sargan's (1958) and Basmann's
(1960) tests of overidentifying restrictions for a regression estimated with 
instrumental variables (IV) in which the number of instruments exceeds the number
of regressors; that is, for an overidentified equation.  These are tests of
the joint null hypothesis that the excluded instruments are valid instruments,
i.e., uncorrelated with the error term and correctly excluded from the
estimated equation.  A rejection casts doubt on the validity of the
instruments.

{p 4 4 2}For single-equation (limited-information) IV 
regression (as implemented in {cmd:ivregress} or {cmd:ivreg2}), write the full
set of instruments as Z and the residuals from the IV estimation as u, let P
represent the "projection matrix" Z*inv(Z'Z)*Z', and let M=I-P, where I is the
identity matrix.  

{p 4 4 2}Sargan's (1958) statistic = u'Pu / (u'u/N)

{p 4 4 2}Basmann's (1960) statistic = u'Pu / (u'Mu/(N-L))

{pstd}
where N is the number of observations, L the number of instruments,
K the number of regressors, and L-K the number of overidentifying restrictions.
The statistics share the same numerator.  The denominators can be
interpreted as two different estimates of the error variance of the estimated
equation, both of which are consistent (see Davidson and MacKinnon
1993, 235-236).
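
{p 4 4 2}As an illustration only, the two statistics can be computed directly
from these formulas in Mata.  The sketch below is not the code that
{cmd:overid} itself runs.  It assumes the auto dataset and a 2SLS model in
which mpg is exogenous, weight and turn are endogenous, and length,
displacement, gear_ratio, and trunk are the excluded instruments; the variable
name uhat and the Mata names u, Z, N, L, and uPu are hypothetical.  Z includes
the constant, and u'Mu is obtained as u'u - u'Pu.

{phang}{cmd:. sysuse auto, clear}

{phang}{cmd:. ivregress 2sls price mpg (weight turn = length displacement gear_ratio trunk)}

{phang}{cmd:. * IV residuals u}

{phang}{cmd:. predict double uhat, residuals}

{phang}{cmd:. mata:}

{phang}{cmd:: u = st_data(., "uhat")}

{phang}{cmd:: // full instrument set: exogenous regressor, excluded instruments, constant}

{phang}{cmd:: Z = st_data(., "mpg length displacement gear_ratio trunk"), J(rows(u), 1, 1)}

{phang}{cmd:: N = rows(Z)}

{phang}{cmd:: L = cols(Z)}

{phang}{cmd:: // shared numerator u'Pu}

{phang}{cmd:: uPu = u'*Z*invsym(Z'*Z)*Z'*u}

{phang}{cmd:: // Sargan: u'Pu / (u'u/N)}

{phang}{cmd:: uPu / (u'*u/N)}

{phang}{cmd:: // Basmann: u'Pu / (u'Mu/(N-L))}

{phang}{cmd:: uPu / ((u'*u - uPu)/(N - L))}

{phang}{cmd:: end}

{p 4 4 2}Under these assumptions, the two displayed values should correspond to
the Sargan and Basmann chi-squared statistics that {cmd:overid} reports for
the same model.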

{p 4 4 2}Under the null hypothesis, both statistics are asymptotically
distributed as chi-squared with L-K degrees of freedom.  Both can be
calculated from an artificial regression of the IV residuals on the full set
of instruments; the Sargan statistic is N times the uncentered R-squared from
this regression.  See Davidson and MacKinnon (1993, 236) and Wooldridge (2002,
123).
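
{p 4 4 2}A minimal sketch of this artificial regression, assuming the auto
dataset and a 2SLS model with mpg as the exogenous regressor, weight and turn
as endogenous regressors, and length, displacement, gear_ratio, and trunk as
excluded instruments (the variable name uhat is hypothetical).  Because the
instrument set includes the constant, the IV residuals have mean zero, so the
centered R-squared that {cmd:regress} stores in {cmd:e(r2)} equals the
uncentered R-squared here.  With L=6 instruments and K=4 regressors (counting
the constant in both), the test has L-K=2 degrees of freedom.

{phang}{cmd:. sysuse auto, clear}

{phang}{cmd:. ivregress 2sls price mpg (weight turn = length displacement gear_ratio trunk)}

{phang}{cmd:. * IV residuals}

{phang}{cmd:. predict double uhat, residuals}

{phang}{cmd:. * artificial regression of the residuals on the full instrument set}

{phang}{cmd:. regress uhat mpg length displacement gear_ratio trunk}

{phang}{cmd:. * Sargan statistic as N times R-squared, with its chi-squared(2) p-value}

{phang}{cmd:. display "Sargan N*R-sq = " e(N)*e(r2)}

{phang}{cmd:. display "p-value       = " chi2tail(2, e(N)*e(r2))}

{p 4 4 2}Note that {cmd:regress} replaces the stored IV estimation results, so
{cmd:overid} must be run before the artificial regression or after refitting
the IV model.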

{p 4 4 2}If there are no overidentifying restrictions (i.e., for 
exact identification, where the number of excluded instruments equals the
number of right-hand endogenous variables), an error message is printed.

{p 4 4 2}The version of this test that is robust to heteroskedasticity in the
errors is Hansen's J statistic.  Under the assumption of conditional
homoskedasticity, Hansen's J reduces to Sargan's statistic (see Hayashi 2000,
227-228), and hence the two are sometimes referred to jointly as the
Hansen-Sargan statistic.  Robust overidentification statistics are available
with {cmd:ivreg2}.  {cmd:overid} will not produce a result if
either the {cmd:robust} or the {cmd:cluster()} option was used in the preceding
IV regression.  {cmd:ivreg2} also provides difference-in-Sargan (C) tests for the
endogeneity of a subset of instruments; see {helpb ivreg2} (if installed)
for details.

{p 4 4 2}The test will fail to run if N<L: Z'Z cannot be of full rank unless
the number of observations is at least as large as the number of instruments.

    {title:IV probit and tobit}

{p 4 4 2}{cmd:overid} will report an overidentification statistic after
estimation by {cmd:ivprobit} and {cmd:ivtobit} with the {cmd:twostep} option.
These Stata commands request Newey's (1987) minimum-distance (or
minimum-chi-squared) IV probit and IV tobit estimators, respectively.  Lee
(1992) shows that the minimized distance for these estimators provides a test
of overidentifying restrictions.  Like the Sargan and Basmann single-equation
statistics, the test statistic is distributed as chi-squared with (L-K) degrees
of freedom under the null that the instruments are valid.  The test statistic
is available only after estimation with the {cmd:twostep} option.

    {title:Three-stage least squares}

{p 4 4 2}{cmd:overid} will report an overidentification statistic after system
estimation with {cmd:reg3}.  As Davidson and MacKinnon (2004, 532) indicate, a
Hansen-Sargan test of the overidentifying restrictions is based on the 3SLS
criterion function evaluated at the 3SLS parameter estimates.  Under the null
hypothesis, the statistic is distributed as chi-squared
with (G*L - K) degrees of freedom, where G is the number of simultaneous
equations.  The procedure takes proper account of linear constraints on the
parameter vector imposed during estimation.

    {title:General comments}

{p 4 4 2}The command displays the test statistics, degrees of freedom and
p-value, and places values in the return array. Type {cmd:return list} for
details.

{p 4 4 2}A full discussion of these computations and related topics can be
found in Baum, Schaffer, and Stillman (2003, 2007).
A version of this routine by Schaffer and Stillman 
that works in the context of panel data is available as {cmd:xtoverid}.


{title:Options}

{pstd}Options {cmd:chi2}, {cmd:dfr}, {cmd:f}, and {cmd:all} only pertain to use
of {cmd:overid} after {cmd:ivregress} or {cmd:ivreg2}.

{p 4 8 2}{cmd:chi2} requests Sargan's and Basmann's chi-squared statistics;
this is the default.

{p 4 8 2}{cmd:dfr} is equivalent to {cmd:chi2}, except that the
Sargan statistic has a small-sample correction:
u'Pu / (u'u/(N-K)).

{p 4 8 2}{cmd:f} requests the pseudo-F test versions of
the Sargan and Basmann statistics.{p_end}
{p 8 8 2}Sargan pseudo-F  = u'Pu/(L-K) / (u'u/(N-K)){p_end}
{p 8 8 2}Basmann pseudo-F = u'Pu/(L-K) / (u'Mu/(N-L)){p_end}

{p 4 8 2}{cmd:all} specifies that all five statistics are to be reported.

{p 4 8 2}{opt depvar(varname)} must be used after {cmd:ivprobit}, version 1.1.8 or
earlier, to specify the dependent variable of the estimated equation.


{title:Examples}

{phang}{stata "sysuse auto" : . sysuse auto}

{phang}{stata "ivregress 2sls price mpg (weight turn=length displacement gear_ratio trunk)" : . ivregress 2sls price mpg (weight turn=length displacement gear_ratio trunk)}

{phang}{stata "overid" : . overid}

{phang}{stata "overid, all" : . overid, all}

{phang}{stata "ivprobit foreign displacement (mpg=length weight turn), twostep" : . ivprobit foreign displacement (mpg=length weight turn), twostep}

{phang}{stata "overid, depvar(foreign)" : . overid, depvar(foreign)}

{phang}{stata "ivtobit gear_ratio displacement (mpg=length weight turn) [fw=rep78], twostep ll(2.2)": . ivtobit gear_ratio displacement (mpg=length weight turn) [fw=rep78], twostep ll(2.2)} 

{phang}{stata "overid" : . overid}

{phang}{stata "webuse klein" : . webuse klein}

{phang}{stata "constraint define 1 [consump]wagepriv = [consump]wagegovt" : . constraint define 1 [consump]wagepriv = [consump]wagegovt}

{phang}{stata "constraint define 2 [consump]govt = [wagepriv]govt" : . constraint define 2 [consump]govt = [wagepriv]govt}

{phang}{stata "reg3 ( consump wagepriv wagegovt govt invest) ( wagepriv consump govt capital1 taxnetx)": . reg3 ( consump wagepriv wagegovt govt invest) ( wagepriv consump govt capital1 taxnetx)}

{phang}{stata "overid" : . overid}

{phang}{stata "reg3 ( consump wagepriv wagegovt govt invest) ( wagepriv consump govt capital1 taxnetx), c(1 2)": . reg3 ( consump wagepriv wagegovt govt invest) ( wagepriv consump govt capital1 taxnetx), c(1 2)}

{phang}{stata "overid" : . overid}


{title:References}

{p 4 8 2} Basmann, R. L. 1960. On finite sample distributions of generalized
classical linear identifiability test statistics.
{it:Journal of the American Statistical Association} 55: 650-659.

{p 4 8 2}Baum, C. F., M. E. Schaffer, and S. Stillman. 2003.
Instrumental variables and GMM: Estimation and testing.
{it:Stata Journal} 3: 1-31.

{p 4 8 2}Baum, C. F., M. E. Schaffer, and S. Stillman. 2007. Enhanced routines
for instrumental variables/generalized method of moments estimation and
testing. {it:Stata Journal} 7: 465-506.

{p 4 8 2} Davidson, R., and J. G. MacKinnon. 1993.
{it:Estimation and Inference in Econometrics}.
New York: Oxford University Press.

{p 4 8 2} Davidson, R., and J. G. MacKinnon. 2004.
{it:Econometric Theory and Methods}.
New York: Oxford University Press.

{p 4 8 2}Hayashi, F. 2000. {it:Econometrics}. Princeton: Princeton University
Press.

{p 4 8 2}Lee, L. 1992. Amemiya's generalized least squares and tests of
overidentification in simultaneous equation models with qualitative or
limited dependent variables. {it:Econometric Reviews} 11: 319-328.

{p 4 8 2}Newey, W. K. 1987. Efficient estimation of limited dependent variable
models with endogenous explanatory variables. {it:Journal of Econometrics}
36: 231-250.

{p 4 8 2}Sargan, J. D. 1958. The estimation of economic relationships using instrumental
variables.  {it:Econometrica} 26: 393-415.

{p 4 8 2}Wooldridge, J. M. 2002. {it:Econometric Analysis of Cross Section and Panel Data}.
Cambridge, MA: MIT Press.


{title:Authors}

	Christopher F Baum, Boston College, USA
	baum@bc.edu

	Mark E Schaffer, Heriot-Watt University, UK
	m.e.schaffer@hw.ac.uk

	Steven Stillman, Motu, New Zealand
	stillman@motu.org.nz
	
	Vince Wiggins, Stata Corporation, USA
	vwiggins@stata.com


{title:Citation}

{p 4 4 2}{cmd:overid} is not an official Stata command. It is a free
contribution to the research community, like a paper. Please cite it as such:
{p_end}

{phang}Baum, C. F., M. E. Schaffer, S. Stillman, and V. Wiggins.  2006.
overid: Stata module to calculate tests of overidentifying restrictions after
ivregress, ivreg2, ivprobit, ivtobit, and reg3. Boston College Department of
Economics, Statistical Software Components S396802. Downloadable from
{browse "http://ideas.repec.org/c/boc/bocode/s396802.html":http://ideas.repec.org/c/boc/bocode/s396802.html}.{p_end}



{title:Also see}

{psee}Manual:  {hi:[R] ivregress}, {hi:[R] ivprobit}, {hi:[R] ivtobit}, {hi:[R] reg3}{p_end}

{psee}Online:  {helpb ivregress}; {helpb ivreg2} (if installed);
{helpb ivprobit}; {helpb ivtobit}; {helpb reg3};
{helpb xtoverid} (if installed){p_end}
